A Semantic Case-Based Reasoning Framework for Text Categorization

Authors

  • Valentina Dragos
  • Sylvie Desprès
Abstract

This paper presents a semantic case-based reasoning framework for text categorization, the task of classifying text documents under predefined categories. Our application field is accidentology: the goal of the framework is to classify documents describing real road accidents under predefined road accident prototypes, which are also described by text documents. Accidents are described by accident reports, while accident prototypes are described by accident scenarios. Text categorization is therefore done by assigning each accident report to an accident scenario, which highlights the particular mechanisms leading to the accident. We propose a textual case-based reasoning (TCBR) approach, which allows us to integrate both textual and domain knowledge in order to carry out this categorization. CBR solves a new problem (the target case) by identifying its similarity to one or several previously solved problems (source cases) stored in a case base and by adapting their known solutions. The cases in our framework are created from text. Most TCBR applications create cases from text by using Information Retrieval techniques, which leads to knowledge-poor descriptions of cases. We show that using semantic resources (two ontologies of accidentology) makes it possible to overcome this difficulty and allows us to enrich cases with formal knowledge. We argue that semantic resources are likely to improve the quality of cases created from text and can therefore support the reasoning cycle. We illustrate this claim with our framework, developed to classify documents in the accidentology domain.
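
The retrieve-and-reuse step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each case has already been reduced to a set of ontology concepts, uses Jaccard overlap as a stand-in similarity measure, and skips the adaptation step entirely. The names `Case`, `jaccard`, `categorize`, and all concept and scenario labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Case:
    """A case as a set of domain concepts; source cases also carry a scenario label."""
    concepts: frozenset
    scenario: Optional[str] = None  # None for a target case (not yet categorized)


def jaccard(a, b):
    """Overlap between two concept sets, in [0, 1] (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def categorize(target, case_base):
    """Retrieve the most similar source case and reuse its scenario label."""
    best = max(case_base, key=lambda source: jaccard(target.concepts, source.concepts))
    return best.scenario, jaccard(target.concepts, best.concepts)


# Hypothetical case base: two accident scenarios described by concept sets.
case_base = [
    Case(frozenset({"pedestrian", "crossing", "urban_road"}), "pedestrian_crossing"),
    Case(frozenset({"vehicle", "intersection", "left_turn"}), "intersection_turn"),
]

# Hypothetical target case built from a new accident report.
report = Case(frozenset({"pedestrian", "crossing", "night"}))
scenario, score = categorize(report, case_base)
print(scenario, score)
```

In this toy setting the report shares two of four distinct concepts with the first scenario and none with the second, so it is assigned to the first. A knowledge-rich approach such as the one the abstract argues for would replace the flat set overlap with a similarity that exploits ontology structure.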

Similar resources

Automatic message annotation and semantic interface for context aware mobile computing

I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners. I understand that my thesis may be made electronically available to the public; therefore I authorise Brunel University to make available electronically to individual or institutions for the purpose of scholarly research. Abstract III...

A rough set-based case-based reasoner for text categorization

This paper presents a novel rough set-based case-based reasoner for use in text categorization (TC). The reasoner has four main components: feature term extractor, document representor, case selector, and case retriever. It operates by first reducing the number of feature terms in the documents using the rough set technique. Then, the number of documents is reduced using a new document selectio...

Preventing Failures by Mining Maintenance Logs with Case-based Reasoning

The project integrates work in natural language processing, machine learning, and the semantic web, bringing together these diverse disciplines in a novel way to address a real problem. The objective is to extract and categorize machine components and subsystems and their associated failures using a novel approach that combines text analysis, unsupervised text clustering, and domain models. Thr...

Dynamic Categorization of Semantics of Fashion Language: A Memetic Approach

Categories are not invariant. This paper attempts to explore the dynamic nature of semantic category, in particular, that of fashion language, based on the cognitive theory of Dawkins’ memetics, a new theory of cultural evolution. Semantic attributes of linguistic memes decrease or proliferate in replication and spreading, which involves a dynamic development of semantic category. More specific...

Machine Reading

Over the last two decades or so, Natural Language Processing (NLP) has developed powerful methods for low-level syntactic and semantic text processing tasks such as parsing, semantic role labeling, and text categorization. Over the same period, the fields of machine learning and probabilistic reasoning have yielded important breakthroughs as well. It is now time to investigate how to leverage t...


Publication date: 2007